Stock Prediction Over 90 Percent Accuracy?

by: Panagiotis Petridis, 8 years ago

Last edited: 8 years ago

I was recently trying out all the available algorithms to see which one worked best at predicting the direction of a stock ("GOOG"). I came across Nearest Centroid Classification and decided to give it a shot. At first I ran it and got the expected 46-49% accuracy. Then I found the "shrink_threshold" parameter and changed it to 0.5, and for some crazy reason I started getting over 90% accuracy. Specifically, it was over 98% most of the time and went as high as 0.995, with the data split 80/20 between training and testing (I have the code below so you can try it out). That is when I understood that something was wrong. My guess is that the algorithm found out how the label was set (if Open>high then -1 else 1). Now I would like to use the algorithm to predict the future direction of the stock, to see whether it is as accurate as it is now (I don't think so, but even 60% would be impressive). What should I change in the data or the code?

Source Code:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestCentroid

df = pd.read_csv("GOOG.csv")

# Labels: the precomputed 'Answers' column (-1 or 1)
y = df['Answers'].to_numpy()

# Features: everything except the label, the date and the adjusted close
df = df.drop(columns=['Adj Close', 'Answers', 'Date'])
df['HL_PCT'] = (df['High'] - df['Low']) / df['Close'] * 100.0
df['PCT_change'] = (df['Close'] - df['Open']) / df['Open'] * 100.0
X = df.to_numpy()

# Random 80/20 train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = NearestCentroid(shrink_threshold=0.5)
clf.fit(X_train, y_train)

# predict() expects a 2D array: one row per sample
print(clf.predict([[740.669983, 742.609985, 737.500000, 739.770020, 1030700, 736.256673, -0.899963, 739.960002, 0.690753, -0.121507]]))

# Accuracy on the held-out 20%
print(clf.score(X_test, y_test))


Data Structure:
Input matrix (X)

         Open        High         Low       Close   Volume      10d MA
0  772.710022  778.549988  766.770020  768.789978  3796000  741.000000
1  747.039978  748.650024  739.299988  745.909973  3266500  740.000000
2  738.280029  744.460022  737.000000  741.770020  1497100  747.690003
3  739.039978  741.690002  734.270020  738.419983  1179400  736.856669
4  740.669983  742.609985  737.500000  739.770020  1030700  736.256673

       Gain  Resistance    HL_PCT  PCT_change
0 -3.920044  771.369995  1.532274   -0.507311
1 -1.130005  744.619995  1.253507   -0.151264
2  3.489991  741.076681  1.005706    0.472719
3 -0.619995  738.126668  1.004846   -0.083892
4 -0.899963  739.960002  0.690753   -0.121507


Output vector (y)

           Answers
0         -1
1         -1
2          1
3         -1
4         -1
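
One quick way to test the suspicion that the label can be read straight off the features is to rebuild it from the sample rows above. The sketch below is only a check on those five rows: the rule it tries (1 when Close is above Open, otherwise -1) is an assumption inferred from the sample, not the rule as stated in the post, and `sample`, `implied` and `matches` are names made up for this illustration.

```python
import numpy as np
import pandas as pd

# The five sample rows shown above (Open, Close, and the Answers label).
sample = pd.DataFrame({
    "Open":    [772.710022, 747.039978, 738.280029, 739.039978, 740.669983],
    "Close":   [768.789978, 745.909973, 741.770020, 738.419983, 739.770020],
    "Answers": [-1, -1, 1, -1, -1],
})

# Hypothesised rule (an assumption, not confirmed by the post):
# label = 1 when the day closes above its open, otherwise -1.
implied = np.where(sample["Close"] > sample["Open"], 1, -1)

# True if the same-day prices alone reproduce every label in the sample.
matches = bool((implied == sample["Answers"].to_numpy()).all())
print("label reproduced from same-day Open/Close:", matches)
```

If the same holds on the full file, the label has the same sign as PCT_change, which is one of the input columns, so a high score would only show that the classifier recovered a value it was handed.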


Data as .csv
https://drive.google.com/file/d/0B_r3mf9HbUrLcmtQY1VTblNOd1k/view?usp=sharing

Any info, resources and help on how I can predict future stock price direction with classification are much appreciated.
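
For the "future direction" part of the question, one common setup is to pair each day's features with the direction of the next day's close and to split the data by time instead of at random. The following is a minimal sketch under assumptions: rows are sorted oldest first, the helper names `make_next_day_dataset` and `chronological_split` are hypothetical, and the prices are synthetic random-walk values generated only so the example runs without GOOG.csv.

```python
import numpy as np
import pandas as pd

def make_next_day_dataset(prices: pd.DataFrame):
    """Pair each day's features with the direction of the NEXT day's close.

    Assumes rows are sorted oldest first.
    """
    feats = pd.DataFrame(index=prices.index)
    feats["HL_PCT"] = (prices["High"] - prices["Low"]) / prices["Close"] * 100.0
    feats["PCT_change"] = (prices["Close"] - prices["Open"]) / prices["Open"] * 100.0

    # Label: 1 if tomorrow's close is above today's close, otherwise -1.
    next_close = prices["Close"].shift(-1)
    label = np.where(next_close > prices["Close"], 1, -1)

    # The last row has no "tomorrow", so it is dropped.
    return feats.iloc[:-1].to_numpy(), label[:-1]

def chronological_split(X, y, test_size=0.2):
    """Train on the earliest rows, test on the most recent ones."""
    cut = int(len(X) * (1 - test_size))
    return X[:cut], X[cut:], y[:cut], y[cut:]

# Synthetic prices (illustrative only, not real GOOG data).
rng = np.random.default_rng(0)
close = 700 + np.cumsum(rng.normal(0, 5, size=50))
open_ = close + rng.normal(0, 2, size=50)
prices = pd.DataFrame({
    "Open": open_,
    "High": np.maximum(open_, close) + 1.0,
    "Low": np.minimum(open_, close) - 1.0,
    "Close": close,
})

X, y = make_next_day_dataset(prices)
X_train, X_test, y_train, y_test = chronological_split(X, y)
print(X_train.shape, X_test.shape)
```

The resulting arrays can be passed to clf.fit and clf.score in place of the randomly split ones. With this layout no feature is computed from the day being predicted, and the test rows all come after the training rows.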






Unfortunately, the reality of the situation is that you cannot predict stock prices, certainly not with anything from scikit-learn. Even when you get something like 90% with a stock like GOOGL, that's because GOOGL is just on a solid upward trend. You'd do better just buying and holding GOOGL than trying to trade it.
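
To see how much of a score comes from the trend alone, it can help to compare the classifier against a baseline that always predicts the most common direction in the training data. This is a rough sketch: `majority_baseline_accuracy` is a hypothetical helper, and the label arrays are made-up values for illustration, not GOOG data.

```python
import numpy as np

def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most frequent training label."""
    values, counts = np.unique(y_train, return_counts=True)
    majority = values[np.argmax(counts)]
    return float(np.mean(np.asarray(y_test) == majority))

# Made-up labels for illustration: mostly "up" days (1), a few "down" days (-1).
y_train = np.array([1, 1, -1, 1, 1, -1, 1, 1])
y_test = np.array([1, -1, 1, 1])

print(majority_baseline_accuracy(y_train, y_test))
```

A model only shows skill if it beats this number on data it has not seen.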

Many brilliant people have dedicated their lives to predicting stock prices using machine learning, and there are lots of papers on the topic on Google Scholar. If you're going to predict stock prices, you're going to do it by analyzing the company itself, not just by looking at patterns of prices. You can use ML algorithms to analyze fundamentals to find over-sold or over-bought companies, but that's just about it.

-Harrison 8 years ago



That's ok. Thankfully there are a lot of things Machine Learning can do other than Stock Prediction. I might look into some papers from Google Scholar though. Thanks again for the info.

-Panagiotis Petridis 8 years ago
